
    Hierarchical cluster guided labeling: efficient label collection for visual classification

    2015 Summer. Visual classification is a core component in many visually intelligent systems. For example, recognition of objects and terrains provides perception during path planning and navigation tasks performed by autonomous agents. Supervised visual classifiers are typically trained with large sets of images to yield high classification performance. Although the collection of raw training data is easy, the human effort required to assign labels to this data is time-consuming. This is particularly problematic in real-world applications with limited labeling time and resources. Techniques have emerged that are designed to help alleviate the labeling workload but suffer from several shortcomings. First, they do not generalize well to domains with limited a priori knowledge. Second, efficiency is achieved at the cost of collecting significant label noise, which inhibits classifier learning or requires additional effort to remove. Finally, they introduce high latency between labeling queries, restricting real-world feasibility. This thesis addresses these shortcomings with unsupervised learning that exploits the hierarchical nature of feature patterns and semantic labels in visual data. Our hierarchical cluster guided labeling (HCGL) framework introduces a novel evaluation of hierarchical groupings to identify the most interesting changes in feature patterns. These changes help localize group selection in the hierarchy to discover and label a spectrum of visual semantics found in the data. We show that employing majority group-based labeling after selection allows HCGL to balance efficiency and label accuracy, yielding higher-performing classifiers than other techniques with respect to labeling effort. Finally, we demonstrate the real-world feasibility of our labeling framework by quickly training high-performing visual classifiers that aid in successful mobile robot path planning and navigation.
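
    The "majority group-based labeling" idea can be pictured with a small sketch: cluster image features into a hierarchy, cut it into groups, and propagate a single annotator-supplied label to every member of each selected group. The sketch below uses SciPy's agglomerative clustering; the function names and the fixed cut are illustrative assumptions, not the HCGL procedure from the thesis.

```python
# Illustrative group-based labeling over a cluster hierarchy
# (a sketch of the general idea, not the thesis's HCGL implementation).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def group_based_labeling(features, annotate_group, n_groups=20):
    """Cluster features hierarchically, then ask the annotator for one
    label per group and propagate it to every member of that group."""
    Z = linkage(features, method="ward")                     # build the hierarchy
    groups = fcluster(Z, t=n_groups, criterion="maxclust")   # cut into groups
    labels = np.empty(len(features), dtype=object)
    for g in np.unique(groups):
        members = np.where(groups == g)[0]
        labels[members] = annotate_group(members)            # one query per group
    return labels

# Usage: the annotator inspects a few example images from the group and
# returns the dominant class, e.g.
# labels = group_based_labeling(X, annotate_group=lambda idx: "grass")
```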

    RELLIS-3D Dataset: Data, Benchmarks and Analysis

    Semantic scene understanding is crucial for robust and safe autonomous navigation, particularly so in off-road environments. Recent deep learning advances for 3D semantic segmentation rely heavily on large sets of training data; however, existing autonomy datasets either represent urban environments or lack multimodal off-road data. We fill this gap with RELLIS-3D, a multimodal dataset collected in an off-road environment, which contains annotations for 13,556 LiDAR scans and 6,235 images. The data was collected on the RELLIS Campus of Texas A&M University, and presents challenges to existing algorithms related to class imbalance and environmental topography. Additionally, we evaluate current state-of-the-art deep learning semantic segmentation models on this dataset. Experimental results show that RELLIS-3D presents challenges for algorithms designed for segmentation in urban environments. This novel dataset provides the resources needed by researchers to continue to develop more advanced algorithms and investigate new research directions to enhance autonomous navigation in off-road environments. RELLIS-3D will be published at https://github.com/unmannedlab/RELLIS-3D.
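
    For readers who want to inspect the annotated LiDAR data, a minimal loading sketch is given below. It assumes the SemanticKITTI-style binary layout commonly used for such datasets (float32 x, y, z, intensity points and uint32 per-point labels); the exact file layout should be confirmed against the RELLIS-3D repository, and the file names here are placeholders.

```python
# Minimal sketch for reading one annotated LiDAR scan, assuming a
# SemanticKITTI-style binary layout (an assumption, see lead-in above).
import numpy as np

def load_scan(bin_path, label_path):
    # Points stored as flat float32 (x, y, z, intensity) records.
    points = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)
    # Per-point labels stored as uint32; lower 16 bits hold the semantic id.
    labels = np.fromfile(label_path, dtype=np.uint32) & 0xFFFF
    assert points.shape[0] == labels.shape[0]
    return points, labels

# points, labels = load_scan("000000.bin", "000000.label")  # placeholder paths
```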

    Evaluating cluster quality for visual data

    2013 Spring. Includes bibliographical references. Digital video cameras have made it easy to collect large amounts of unlabeled data that can be used to learn to recognize objects and actions. Collecting ground-truth labels for this data, however, is a much more time-consuming task that requires human intervention. One approach to train on this data, while keeping the human workload to a minimum, is to cluster the unlabeled samples, evaluate the quality of the clusters, and then ask a human annotator to label only the clusters believed to be dominated by a single object/action class. This thesis addresses the task of evaluating the quality of unlabeled image clusters. We compare four cluster quality measures (and a baseline method) using real-world and synthetic data sets. Three of these measures can be found in the existing data mining literature: Dunn Index, Davies-Bouldin Index and Silhouette Width. We introduce a novel cluster quality measure as the fourth measure, derived from recent advances in approximate nearest neighbor algorithms from the computer vision literature, called Proximity Forest Connectivity (PFC). Experiments on real-world data show that no cluster quality measure performs "best" on all data sets; however, our novel PFC measure is always competitive and results in more top performances than any of the other measures. Results from synthetic data experiments show that while the data mining measures are susceptible to the over-clustering typically required of visual data, PFC is much more robust. Further synthetic data experiments modeling features of visual data show that Davies-Bouldin is most robust to large amounts of class-specific noise. However, Davies-Bouldin, Silhouette and PFC all perform well in the presence of data with small amounts of class-specific noise, whereas Dunn struggles to perform better than random.
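
    Three of the compared measures have standard implementations; a brief sketch using scikit-learn (for Davies-Bouldin and Silhouette, where lower and higher scores are better, respectively) and a direct Dunn Index computation is shown below. The thesis's proposed PFC measure has no off-the-shelf implementation and is omitted here.

```python
# Sketch of the off-the-shelf cluster quality measures compared in the thesis.
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.metrics import davies_bouldin_score, silhouette_score

def dunn_index(X, labels):
    """Minimum inter-cluster distance divided by maximum intra-cluster diameter."""
    clusters = [X[labels == c] for c in np.unique(labels)]
    min_sep = min(cdist(a, b).min()
                  for i, a in enumerate(clusters)
                  for b in clusters[i + 1:])
    max_diam = max(cdist(c, c).max() for c in clusters)
    return min_sep / max_diam

def score_clustering(X, labels):
    return {"dunn": dunn_index(X, labels),                    # higher is better
            "davies_bouldin": davies_bouldin_score(X, labels),  # lower is better
            "silhouette": silhouette_score(X, labels)}          # higher is better
```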

    A New Iterative Method for Ranking College Football Teams

    This paper introduces a new iterative model for ranking college football teams. It is first presented as a general model with a number of parameters. We then introduce two learning methods that use past data to predict the optimal values of the parameters for the model. Our learning algorithms are then implemented using data from 1998-2008. We analyze the accuracy of our rankings by considering bowl game outcomes for each season. We also compare our results with the Bowl Championship Series computer ranking system. We close with a discussion of possible directions for future work.
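
    The paper's specific model and learned parameter values are not reproduced here, but a generic iterative rating scheme gives the flavor of the approach: each pass re-estimates a team's rating from its opponents' current ratings plus a margin term, and a free parameter (here `alpha`) is the kind of quantity the paper's learning methods would fit from past seasons. The sketch below is an illustrative stand-in, not the paper's model.

```python
# Generic iterative rating scheme for intuition only (not the paper's model).
def iterate_ratings(games, teams, alpha=1.0, n_iter=100):
    """games: list of (winner, loser, point_margin) tuples.
    Each pass re-estimates a team's rating from its opponents' current
    ratings, rewarding wins and penalizing losses by the margin."""
    ratings = {t: 0.0 for t in teams}
    for _ in range(n_iter):
        contributions = {t: [] for t in teams}
        for winner, loser, margin in games:
            contributions[winner].append(ratings[loser] + alpha * margin)
            contributions[loser].append(ratings[winner] - alpha * margin)
        ratings = {t: sum(v) / len(v) if v else ratings[t]
                   for t, v in contributions.items()}
    # Highest rating first.
    return sorted(ratings.items(), key=lambda kv: kv[1], reverse=True)
```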

    Robot navigation from human demonstration: learning control behaviors with environment feature maps

    When working alongside human collaborators in dynamic and unstructured environments, such as disaster recovery or military operations, fast field adaptation is necessary for an unmanned ground vehicle (UGV) to perform its duties or learn novel tasks. In these scenarios, personnel and equipment are constrained, making training with minimal human supervision a desirable learning attribute. We address the problem of making UGVs more reliable and adaptable teammates with a novel framework that uses visual perception and inverse optimal control to learn traversal costs for environment features. Through extensive evaluation in a real-world environment, we show that our framework requires few human-demonstrated trajectory exemplars to learn feature costs that reliably encode several different traversal behaviors. Additionally, we present an on-line version of the framework that allows a human teammate to intervene during live operation to correct deteriorated behavior or to adapt behavior to dynamic changes in complex and unstructured environments.
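
    The core idea, learning a cost over environment features so that a planner reproduces human-demonstrated paths, can be sketched in a few lines. The sketch below follows a generic inverse-optimal-control style update; `plan_path`, the exponential cost form, and the learning rate are assumptions for illustration, not the paper's algorithm.

```python
# Schematic learning of a linear traversal cost from demonstrations.
# `plan_path` stands in for any planner (e.g., A*) over the current cost map
# and is assumed here, not defined by the paper.
import numpy as np

def learn_feature_costs(demos, feature_map, plan_path, lr=0.01, n_iter=50):
    """demos: list of demonstrated paths (sequences of grid cells).
    feature_map[cell] -> feature vector for that cell. Weights are nudged so
    the planner's path accumulates the same features as the demonstration."""
    n_features = len(next(iter(feature_map.values())))
    w = np.zeros(n_features)
    for _ in range(n_iter):
        for demo in demos:
            cost = {cell: float(np.exp(w @ f)) for cell, f in feature_map.items()}
            planned = plan_path(cost, start=demo[0], goal=demo[-1])
            f_demo = sum(feature_map[c] for c in demo)
            f_plan = sum(feature_map[c] for c in planned)
            # Raise the cost of features the planner overuses relative to the demo.
            w += lr * (f_plan - f_demo)
    return w
```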

    An Intelligence Architecture for Grounded Language Communication with Field Robots

    For humans and robots to collaborate effectively as teammates in unstructured environments, robots must be able to construct semantically rich models of the environment, communicate efficiently with teammates, and perform sequences of tasks robustly with minimal human intervention, as direct human guidance may be infrequent and/or intermittent. Contemporary architectures for human-robot interaction often rely on engineered human-interface devices or structured languages that require extensive prior training and inherently limit the kinds of information that humans and robots can communicate. Natural language, particularly when situated with a visual representation of the robot’s environment, allows humans and robots to exchange information about abstract goals, specific actions, and/or properties of the environment quickly and effectively. In addition, it serves as a mechanism to resolve inconsistencies in the mental models of the environment across the human-robot team. This article details a novel intelligence architecture that exploits a centralized representation of the environment to perform complex tasks in unstructured environments. The centralized environment model is informed by a visual perception pipeline, declarative knowledge, deliberate interactive estimation, and a multimodal interface. The language pipeline also exploits proactive symbol grounding to resolve uncertainty in ambiguous statements through inverse semantics. A series of experiments on three different unmanned ground vehicles demonstrates the utility of this architecture through its robust ability to perform language-guided spatial navigation, mobile manipulation, and bidirectional communication with human operators. Experimental results give examples of component-level behaviors and overall system performance that guide a discussion on observed performance and opportunities for future innovation.
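
    The centralized environment model at the heart of the architecture can be pictured as a single structure that perception, declarative knowledge, and the language pipeline all read and write. The schematic below is purely illustrative; its class and method names are assumptions, not the article's API.

```python
# Highly schematic centralized environment model shared by several components
# (illustrative only; not the article's implementation).
from dataclasses import dataclass, field

@dataclass
class WorldModel:
    objects: dict = field(default_factory=dict)      # symbol -> estimated pose/class
    annotations: dict = field(default_factory=dict)  # facts from declarative knowledge

    def update_from_perception(self, detections):
        # Detections from the visual perception pipeline.
        self.objects.update(detections)

    def update_from_dialogue(self, grounded_facts):
        # Symbols grounded from a natural-language instruction.
        self.annotations.update(grounded_facts)

# Perception, the language pipeline, and the human interface all operate on the
# same WorldModel instance, which planners then query to execute tasks.
```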